Aspiration Noise during Phonation: Synthesis, Analysis, and Pitch-scale Modification

Authors

  • Daryush Mehta
  • Thomas F. Quatieri
Abstract

The current study investigates the synthesis and analysis of aspiration noise in synthesized and spoken vowels. Based on the linear source-filter model of speech production, we implement a vowel synthesizer in which the aspiration noise source is temporally modulated by the periodic source waveform. Modulations in the noise source waveform and their synchrony with the periodic source are shown to be salient for natural-sounding vowel synthesis. After developing the synthesis framework, we review past approaches to separating the two additive components of the model. A challenge for analysis based on this model is the accurate estimation of the aspiration noise component, which has energy across the frequency spectrum and temporal structure arising from modulations in the noise source. Spectral harmonic/noise component analysis of spoken vowels shows evidence of noise modulations, with peaks in the estimated noise source component synchronous both with the open phase of the periodic source and with the time instants of glottal closure. Inspired by this observation of natural modulations in the aspiration noise source, we develop an alternative approach to the speech signal processing aim of accurate pitch-scale modification. The proposed strategy takes a dual processing approach, in which the periodic and noise components of the speech signal are separately analyzed, modified, and re-synthesized. The periodic component is modified using our implementation of time-domain pitch-synchronous overlap-add, and the noise component is handled by modifying characteristics of its source waveform. Since we have modeled an inherent coupling between the original periodic and aspiration noise sources, the modification algorithm is designed to preserve the synchrony between temporal modulations of the two sources. The reconstructed modified signal is perceived as natural-sounding and generally exhibits fewer of the artifacts that are typically heard with current modification techniques.
Thesis Supervisor: Thomas F. Quatieri
Title: Senior Member of Technical Staff, MIT Lincoln Laboratory
Faculty of MIT Speech and Hearing Bioscience and Technology Program
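
The synthesis idea in the abstract (a source-filter vowel model whose aspiration noise is amplitude-modulated in step with the periodic source) can be illustrated with a minimal sketch. This is not the thesis implementation: the raised-cosine pulse shape, the `floor` and `noise_gain` parameters, the formant values, and all function names below are assumptions chosen only for illustration.

```python
import math
import random


def glottal_pulse_train(f0, fs, duration, open_quotient=0.6):
    """Periodic source: one raised-cosine flow pulse per pitch period.

    The pulse shape is an assumed stand-in for a real glottal flow model.
    """
    period = int(round(fs / f0))
    n_open = int(round(open_quotient * period))
    flow = []
    for n in range(int(round(duration * fs))):
        k = n % period
        if k < n_open:   # open phase: smooth pulse between 0 and 1
            flow.append(0.5 * (1.0 - math.cos(2.0 * math.pi * k / n_open)))
        else:            # closed phase: no flow
            flow.append(0.0)
    return flow


def modulated_noise(flow, noise_gain=0.1, floor=0.2, seed=0):
    """Gaussian noise whose envelope follows the periodic source waveform.

    `floor` (assumed) keeps some noise during the closed phase.
    """
    rng = random.Random(seed)
    return [noise_gain * (floor + (1.0 - floor) * g) * rng.gauss(0.0, 1.0)
            for g in flow]


def resonator(x, freq, bandwidth, fs):
    """Second-order all-pole resonator modeling a single formant."""
    r = math.exp(-math.pi * bandwidth / fs)
    a1 = -2.0 * r * math.cos(2.0 * math.pi * freq / fs)
    a2 = r * r
    gain = 1.0 + a1 + a2  # normalizes the gain at 0 Hz to one
    y1 = y2 = 0.0
    out = []
    for s in x:
        y = gain * s - a1 * y1 - a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def vocal_tract(x, formants, fs):
    """Cascade of formant resonators (a linear filter)."""
    for freq, bandwidth in formants:
        x = resonator(x, freq, bandwidth, fs)
    return x


def synthesize_vowel(f0=120.0, fs=16000, duration=0.2,
                     formants=((700, 90), (1200, 110), (2600, 170))):
    """Return (periodic, aspiration, vowel); vowel is the sum of the two."""
    flow = glottal_pulse_train(f0, fs, duration)
    noise = modulated_noise(flow)
    periodic = vocal_tract(flow, formants, fs)
    aspiration = vocal_tract(noise, formants, fs)
    vowel = [p + a for p, a in zip(periodic, aspiration)]
    return periodic, aspiration, vowel
```

Calling `synthesize_vowel()` returns the periodic component, the aspiration component, and their sum as lists of samples. Because the filter is linear, the two components stay additive at the output, which is the property that harmonic/noise separation relies on. Setting `floor=1.0` in `modulated_noise` removes the modulation and gives an unmodulated-noise baseline for comparison.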


Similar resources

Modelling aspiration noise during phonation using the LF voice source model

This paper presents a technique for modelling the aspiration noise produced during phonation. The method employs the widely used LF voice source model, which is here extended to include a turbulence noise source for the generation of aspiration. Drawing on speech production theory and on empirical data, the overall amplitude level as well as the within-pulse modulation of the noise are determin...

Glottal source modeling for singing voice synthesis

Naturalness of sound quality is essential for singing-voice synthesis. Since 95% of singing is voiced sound (Cook, 1990), the focus of this paper is to improve the naturalness of the vowel tone quality via glottal excitation modeling. We propose to use the LF-model (Fant et al., 1985) for the glottal wave shape in conjunction with pitch-synchronous, amplitude-modulated Gaussian noise, which add...

Pitch-Scale Modification using the Mod

Spectral harmonic/noise component analysis of spoken vowels shows evidence of noise modulations with peaks in the estimated noise source component synchronous with both the open phase of the periodic source and with time instants of glottal closure. Inspired by this observation of natural modulations and of fullband energy in the aspiration noise source, we develop an alternate approach to high...

Analysis, synthesis, and perception of voice quality variations among female and male talkers.

Voice quality variations include a set of voicing sound source modifications ranging from laryngealized to normal to breathy phonation. Analysis of reiterant imitations of two sentences by ten female and six male talkers has shown that the potential acoustic cues to this type of voice quality variation include: (1) increases to the relative amplitude of the fundamental frequency component as op...

A hybrid method oriented to concatenative text-to-speech synthesis

In this paper we present a speech synthesis method for diphone-based text-to-speech systems. Its main goal is to achieve prosodic modifications that result in more natural-sounding synthetic speech. This improvement is especially useful for emotional speech synthesis, which requires high-quality prosodic modification. We present a hybrid method based on TD-PSOLA and the harmonic plus noise model...

Publication date: 2006